Efficient Error Setting for Subspace Miners

Authors

  • Eran Shaham
  • David Sarne
  • Boaz Ben-Moshe
Abstract

A typical mining problem is the extraction of patterns from subspaces of multidimensional data. Such patterns, known as biclusters, comprise subsets of objects that behave similarly across subsets of attributes and may overlap one another; that is, an object or attribute may belong to several patterns or to none. For many miners, a key input parameter is the error, which greatly affects the quality, quantity, and coherency of the mined clusters. Because the error is dataset dependent, setting it demands either domain knowledge or trial and error. The paper presents a new method for automatically setting the error to the value that maximizes the number of clusters mined. This error value is strongly correlated with the value at which performance scores are maximized. The correlation is extensively evaluated using six datasets, two mining algorithms, and seven prevailing performance measures, and is compared with five methods from the prior literature, demonstrating a substantial improvement in the mining score.
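The selection criterion described in the abstract (choose the error for which the miner returns the most clusters) can be illustrated with a naive sweep over candidate error values. This is only a sketch under stated assumptions: the abstract does not describe the authors' actual procedure, which is presumably more efficient than exhaustive search. The names `select_error_by_cluster_count` and `toy_miner` are hypothetical, and the toy miner is a one-dimensional stand-in, not a real bicluster miner.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical miner signature: (data, error) -> list of clusters.
Miner = Callable[[Sequence[float], float], List[List[float]]]


def select_error_by_cluster_count(
    mine: Miner, data: Sequence[float], candidate_errors: Sequence[float]
) -> Tuple[float, int]:
    """Return (error, cluster_count) for the candidate that yields the most clusters.

    Naive exhaustive sweep; ties are resolved in favor of the earliest candidate.
    """
    if not candidate_errors:
        raise ValueError("candidate_errors must not be empty")
    best_error, best_count = candidate_errors[0], -1
    for error in candidate_errors:
        count = len(mine(data, error))
        if count > best_count:
            best_error, best_count = error, count
    return best_error, best_count


def toy_miner(data: Sequence[float], error: float) -> List[List[float]]:
    """Toy stand-in for a miner (an assumption, not from the paper).

    Groups sorted values into runs whose adjacent gaps are at most `error`
    and reports only runs with at least two members.
    """
    runs: List[List[float]] = []
    for value in sorted(data):
        if runs and value - runs[-1][-1] <= error:
            runs[-1].append(value)
        else:
            runs.append([value])
    return [run for run in runs if len(run) >= 2]


data = [1.0, 1.1, 5.0, 5.2, 9.0, 9.3]
candidates = [0.05, 0.15, 0.25, 0.5, 4.0]
best_error, best_count = select_error_by_cluster_count(toy_miner, data, candidates)
print(best_error, best_count)
```

In this toy setting, a very small error leaves every value isolated and a very large error merges everything into a single cluster, so the cluster count peaks at an intermediate error. That is the kind of behavior the criterion relies on.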


Similar resources

Isotropic Constant Dimension Subspace Codes

In the network coding setting, a constant dimension code is a set of k-dimensional subspaces of F_q^n. If F_q^n is a nondegenerate symplectic vector space with bilinear form f, an isotropic subspace U of F_q^n is a subspace such that f(x, y) = 0 for all x, y ∈ U. We introduce isotropic subspace codes simply as sets of isotropic subspaces and show how the isotropic property is used in the decoding process, then...


Direction-of-Arrival Estimation of Sources Using the Khatri-Rao Subspace

This paper deals with direction-of-arrival (DOA) estimation using a uniform linear array (ULA) for the case of more sources than sensors in array processing. The Khatri-Rao subspace approach was introduced for DOA estimation in this case under a non-stationary signal model. The technique is shown to be capable of handling stationary signals, too. Identifiability conditions of this approach are addressed....


USING SUBSPACE METHODS FOR ESTIMATING ARMA MODELS FOR MULTIVARIATE TIME SERIES WITH CONDITIONALLY HETEROSKEDASTIC INNOVATIONS

This paper deals with the estimation of linear dynamic models of the ARMA type for the conditional mean of time series with a conditionally heteroskedastic innovation process, as widely used in modelling financial time series. Estimation is performed using subspace methods, which are known to have computational advantages over prediction error methods based on criterion minimization. These ...


A simple efficient density estimator that enables fast systematic search

This paper introduces a simple and efficient density estimator that enables fast systematic search. To show its advantage over the commonly used kernel density estimator, we apply it to outlying aspects mining. Outlying aspects mining discovers feature subsets (or subspaces) that describe how a query stands out from a given dataset. The task demands a systematic search of subspaces. We identify that...


Solution of the Time-Domain Inverse Resistivity Problem in the Model Reduction Framework Part I. One-Dimensional Problem with SISO Data

Many time-domain problems in engineering applications can be described by means of parameter-dependent time-invariant dynamic systems. We are interested in parameter estimation by fitting available transient measurements using the nonlinear least squares method. As the main application, we consider the controlled-source electromagnetic method (CSEM) of geophysical exploration governed by the dif...



Publication date: 2014